Aim: run univariate and bivariate analyses using eQTL data for select loci
Bivariate local \(r_{g}\) analyses revealed several loci (representing LD blocks) where significant \(r_{g}\)'s were found between multiple trait pairs. Of note, some LD blocks contained local \(r_{g}\)'s between neurodegenerative traits and between neuropscyhiatric traits, but with no overlap between the disorder groups. In addition, we observed LD blocks where several traits were associated with one trait in particular, but directions of effect were opposing (e.g. locus 1719, where BIP and MDD were positively correlated with SCZ, while LBD was negatively correlated with SCZ; or locus 2351, where AD and PD were positively and negatively correlated with LBD, respectively). Assuming that these assocations also correlate with expression quantitative trait loci (eQTLs), such differences may be driven by associations to different gene eQTLs.
Following section includes any intermediary code used in this .Rmd.
source(here::here("scripts", "04a_get_gene_loci.R"))All gene loci are shown below.
gene_loci <-
read_delim(
file = here::here("results", "04_eqtl_univar_bivar", "window_100000", "gene_filtered.loci"),
delim = "\t"
)
gene_loci %>%
DT::datatable(rownames = FALSE,
options = list(scrollX = TRUE),
class = 'white-space: nowrap')source(here::here("scripts", "04b_preprocess_eqtl.R"))source(here::here("scripts", "04c_get_sample_overlaps.R"))Subset of sample overlap matrix shown below.
sample_overlap <-
read.table(
file = here::here("results", "04_eqtl_univar_bivar", "sample_overlap.txt")
)
as.matrix(sample_overlap)[1:10,1:10]## AD2019 BIP2021 LBD2020 MDD2019 PD2019.meta5.ex23andMe
## AD2019 1.00000 0.00628 0.02589 0.01946 0.01193
## BIP2021 0.00628 1.00000 0.00800 0.05997 -0.00150
## LBD2020 0.02589 0.00800 1.00000 0.00040 0.01281
## MDD2019 0.01946 0.05997 0.00040 1.00000 0.00436
## PD2019.meta5.ex23andMe 0.01193 -0.00150 0.01281 0.00436 1.00000
## SCZ2018 0.01150 0.12928 0.00661 0.03615 0.00295
## EQTLGEN_ENSG00000007047 0.00000 0.00000 0.00000 0.00000 0.00000
## EQTLGEN_ENSG00000007255 0.00000 0.00000 0.00000 0.00000 0.00000
## EQTLGEN_ENSG00000041353 0.00000 0.00000 0.00000 0.00000 0.00000
## EQTLGEN_ENSG00000048028 0.00000 0.00000 0.00000 0.00000 0.00000
## SCZ2018 EQTLGEN_ENSG00000007047 EQTLGEN_ENSG00000007255
## AD2019 0.01150 0 0
## BIP2021 0.12928 0 0
## LBD2020 0.00661 0 0
## MDD2019 0.03615 0 0
## PD2019.meta5.ex23andMe 0.00295 0 0
## SCZ2018 1.00000 0 0
## EQTLGEN_ENSG00000007047 0.00000 1 0
## EQTLGEN_ENSG00000007255 0.00000 0 1
## EQTLGEN_ENSG00000041353 0.00000 0 0
## EQTLGEN_ENSG00000048028 0.00000 0 0
## EQTLGEN_ENSG00000041353 EQTLGEN_ENSG00000048028
## AD2019 0 0
## BIP2021 0 0
## LBD2020 0 0
## MDD2019 0 0
## PD2019.meta5.ex23andMe 0 0
## SCZ2018 0 0
## EQTLGEN_ENSG00000007047 0 0
## EQTLGEN_ENSG00000007255 0 0
## EQTLGEN_ENSG00000041353 1 0
## EQTLGEN_ENSG00000048028 0 1
# Most input info already available, just need to add eqtls
input_info <-
read_delim(
file = file.path(here::here("results", "01_input_prep"),
"input.info.txt"),
delim = "\t"
)
file_paths <-
list.files(
path = here::here("results", "04_eqtl_univar_bivar", "qtl_files"),
pattern = "lava.gz",
full.names = T
)
input_info <-
tibble(
phenotype =
basename(file_paths) %>%
str_remove(".lava.gz"),
cases = rep(1, times = length(file_paths)),
controls = rep(0, times = length(file_paths)),
filename = file_paths
) %>%
dplyr::bind_rows(
input_info
)
write_delim(
input_info,
path = file.path(here::here("results", "04_eqtl_univar_bivar"),
"input.info.txt"),
delim = "\t"
)This was run using nohup:
# Have to navigate to root project folder for script to work (as it uses here package)
cd /home/rreynolds/misc_projects/neurodegen-psych-local-corr
nohup Rscript \
/home/rreynolds/misc_projects/neurodegen-psych-local-corr/scripts/04d_run_univar_bivar_test.R \
&>/home/rreynolds/misc_projects/neurodegen-psych-local-corr/logs/04d_run_univar_bivar_test_multwindows.log&Once run, results can be loaded using the following code chunk:
results <-
setNames(
vector(mode = "list", length = 2),
nm = c("univ", "bivar")
) %>%
lapply(., function(x){
setNames(
vector(mode = "list", length = 2),
nm = c("window_100000", "window_50000")
)
})
for(test in names(results)){
for(window in names(results[[test]]))
results[[test]][[window]] <-
setNames(
object =
list.files(
path =
here::here(
"results",
"04_eqtl_univar_bivar",
window,
test
),
pattern = ".lava.rds",
full.names = T
) %>%
lapply(., function(file) readRDS(file)),
nm =
list.files(
path =
here::here(
"results",
"04_eqtl_univar_bivar",
window,
test
),
pattern = ".lava.rds"
) %>%
str_remove(., "\\..*.lava.rds")
) %>%
purrr::discard(is.null) %>%
qdapTools::list_df2df(col1 = "eqtl_gene") %>%
dplyr::rename(
gene_locus = locus
) %>%
tidyr::separate(
col = eqtl_gene,
into = c("eqtl_dataset", NA),
sep = ":"
) %>%
dplyr::inner_join(
gene_filtered_loci %>%
dplyr::select(
ld_block = locus, gene_id, gene_name
),
by = c("gene_locus" = "gene_id")
) %>%
dplyr::select(
ld_block, eqtl_dataset, gene_locus, gene_name, everything()
) %>%
dplyr::arrange(ld_block, gene_locus, eqtl_dataset) %>%
as_tibble()
}source(here::here("R", "plots.R"))Firstly, we determined a number of LD blocks of interest from the LD blocks highlighted by bivariate local \(r_{g}\) analyses. LD blocks included:
From these LD blocks of interest, we defined gene loci. That is, a 100 kb window was added to the start and end of any gene that overlapped a locus of interest. These gene-defined loci (\(n\) = 92) were carried forward in downstream analyses. Further, to ensure that our results were not driven by window size, we re-ran all analyses using a 50 kb window.
Due to the potential sample overlap between cohorts and its impact on any estimated correlations, it is advised to use cross-trait LDSC (PMID:25642630; PMID:26414676) to obtain an estimate of the sample overlap. This, however, is not possible with eQTL summary statistics. Provided a reasonable certainty that there is no sample overlap between eQTL and GWAS summary statistics, it can be assumed to be zero (which precludes any correlations between eQTLGen and PsychENCODE, as both use GTEx samples, albeit from different tissues). This, however, needs double-checking in terms of eQTL and GWAS cohorts.
For each gene locus, only those traits that were found to have significant local \(r_{g}\) in the associated LD block were used for univariate and bivariate analyses with eQTL summary statistics. As described in 02_run_univar_bivar_test.html, a univariate test was performed as a filtering step for bivariate local \(r_{g}\) analyses. Bivariate local correlations were only performed (i) if the eQTL within the gene locus exhibited a significant univariate local genetic signal and (ii) for pairs of traits which both exhibited a significant univariate local genetic signal. A cut-off of p < 0.0005434783 was used to determine univariate significance. This resulted in a total of **** bivariate tests spanning 55 distinct gene loci. Bivariate results were multiple test corrected using two strategies: (i) a more stringent Bonferroni correction (i.e. p < and (ii) a more lenient FDR correction.
print("Univariate column descriptions:")## [1] "Univariate column descriptions:"
tibble(
column = colnames(results$univ$window_100000),
description =
c(
"LD block",
"eQTL dataset used",
"Gene locus as identified by ensembl ID",
"Gene name",
"Gene locus chromosome",
"Gene locus start base pair",
"Gene locus end base pair",
"The number of SNPs within the locus",
"The number of PCs retained within the locus",
"Analysed phenotype",
"Observed local heritability",
"P-value from the univariate test (F-test for continuous, Chi-sq for binary)"
)
) %>%
DT::datatable(rownames = FALSE,
options = list(scrollX = TRUE),
class = 'white-space: nowrap')print("Univariate results (p < 0.05/n_loci):")## [1] "Univariate results (p < 0.05/n_loci):"
results$univ <-
results$univ %>%
lapply(., function(df){
df %>%
dplyr::mutate(
phen =
case_when(
!str_detect(phen, "ENSG") ~ phen %>%
str_replace_all("[:digit:]", "") %>%
str_remove("\\..*"),
str_detect(phen, "ENSG") ~ str_remove(phen, ".*_")
)
)
})
results$univ %>%
qdapTools::list_df2df(col1 = "window_size") %>%
dplyr::filter(p < 0.05/nrow(gene_loci)) %>%
DT::datatable(rownames = FALSE,
options = list(scrollX = TRUE),
class = 'white-space: nowrap')results$bivar <-
results$bivar %>%
lapply(., function(df){
df %>%
dplyr::mutate(
across(
.cols = contains("phen"),
~ case_when(
!str_detect(.x, "ENSG") ~ .x %>%
str_replace_all("[:digit:]", "") %>%
str_remove("\\..*"),
str_detect(.x, "ENSG") ~ str_remove(.x, ".*_")
)
)
) %>%
dplyr::mutate(
fdr = p.adjust(p, method = "fdr"),
bonferroni = p.adjust(p, method = "bonferroni")
)
})
# Pivot results
results_long <-
results$bivar %>%
qdapTools::list_df2df(col1 = "window_size") %>%
tidyr::pivot_longer(
cols = -c("window_size", "ld_block", "eqtl_dataset", "gene_locus", "gene_name", "phen1", "phen2"),
names_to = "feature",
values_to = "value"
) %>%
tidyr::pivot_wider(names_from = "window_size")
print("Bivariate column descriptions:")## [1] "Bivariate column descriptions:"
tibble(
column = colnames(results$bivar$window_100000),
description =
c(
"LD block",
"eQTL dataset used",
"Gene locus as identified by ensembl ID",
"Gene name",
"Gene locus chromosome",
"Gene locus start base pair",
"Gene locus end base pair",
"The number of SNPs within the locus",
"The number of PCs retained within the locus",
"Phenotype 1",
"Phenotype 2",
"Standardised coefficient for the local genetic correlation",
"Lower 95% confidence estimate for rho",
"Upper 95% confidence estimate for rho",
"Equivalent of taking the square of rho. Denotes the proportion of variance in genetic signal for phen1 explained by phen2 (and vice versa)",
"Lower 95% confidence estimate for r2",
"Upper 95% confidence estimate for r2",
"Simulation p-values for the local genetic correlation",
"FDR-corrected p-values",
"Bonferroni-corrected p-values"
)
) %>%
DT::datatable(rownames = FALSE,
options = list(scrollX = TRUE),
class = 'white-space: wrap')print("Bivariate results (FDR < 0.05):")## [1] "Bivariate results (FDR < 0.05):"
bivar_fdr <-
results$bivar %>%
lapply(., function(df) df %>% dplyr::filter(fdr < 0.05))
bivar_fdr %>%
qdapTools::list_df2df(col1 = "window_size") %>%
DT::datatable(rownames = FALSE,
options = list(scrollX = TRUE),
class = 'white-space: nowrap')print("Bivariate results (Bonferroni p < 0.05):")## [1] "Bivariate results (Bonferroni p < 0.05):"
bivar_bonf <-
results$bivar %>%
lapply(., function(df) df %>% dplyr::filter(bonferroni < 0.05))
bivar_bonf %>%
qdapTools::list_df2df(col1 = "window_size") %>%
DT::datatable(rownames = FALSE,
options = list(scrollX = TRUE),
class = 'white-space: nowrap')overlap <-
bivar_fdr$window_100000 %>%
dplyr::select(
-chr, -start, -stop, -contains("n_"), -contains("lower"), -contains("upper")
) %>%
dplyr::left_join(
results$bivar$window_50000 %>%
dplyr::select(
-chr, -start, -stop, -contains("n_"), -contains("lower"), -contains("upper")
),
by = c("ld_block", "eqtl_dataset", "gene_locus", "gene_name", "phen1", "phen2"),
suffix = c("_100000", "_50000")
) %>%
dplyr::bind_rows(
bivar_fdr$window_50000 %>%
dplyr::select(
-chr, -start, -stop, -contains("n_"), -contains("lower"), -contains("upper")
) %>%
dplyr::left_join(
results$bivar$window_100000 %>%
dplyr::select(
-chr, -start, -stop, -contains("n_"), -contains("lower"), -contains("upper")
),
by = c("ld_block", "eqtl_dataset", "gene_locus", "gene_name", "phen1", "phen2"),
suffix = c("_50000", "_100000")
)
) %>%
select(
ld_block:phen2, contains("rho"), contains("r2"), contains("p"), contains("fdr"), contains("bonferroni")
) %>%
dplyr::distinct()
overlap %>%
dplyr::arrange(ld_block, gene_name) %>%
DT::datatable(rownames = FALSE,
options = list(scrollX = TRUE),
class = 'white-space: nowrap')results_long %>%
dplyr::filter(
!feature %in%
c("chr", "start", "stop", "fdr", "bonferroni", "n_pcs", "n_snps"),
!str_detect(feature, "upper"),
!str_detect(feature, "lower")
) %>%
dplyr::mutate(
across(
.cols = contains("window"),
~ case_when(
feature == "p" ~ -log10(.x),
TRUE ~ .x
)
)
) %>%
dplyr::mutate(
feature =
case_when(
feature == "p" ~ "-log10(p)",
TRUE ~ feature
),
phenotype_corr =
case_when(
str_detect(phen2, "ENSG") ~ "gwas-eqtl",
TRUE ~ "gwas-gwas"
)
) %>%
ggplot(
aes(
x = window_50000,
y = window_100000
)
) +
geom_point(
aes(colour = phenotype_corr),
alpha = 0.4
) +
geom_smooth(
method = "lm",
formula = "y~x",
level = 0.99,
colour = "black",
fill = "grey"
) +
ggpubr::stat_cor(
method = "pearson",
cor.coef.name = "R"
) +
geom_abline(
intercept = 0,
linetype = "dashed",
colour = "#EE442F"
) +
facet_wrap(
vars(feature),
scales = "free"
) +
labs(
x = "50 kb window",
y = "100 kb window"
) +
theme_rhrFigure 4.1: Scatter plot of -log10(p-value), r2 and rho for each pair of phenotypes that could be tested across gene loci with a 50 kb or 100 kb window. Points are coloured by the type of phenotypes correlated e.g. GWAS-GWAS or GWAS-eQTL. The black line represents a linear model fitted to the data, with the 99% confidence interval indicated with a grey fill. The red dashed line represents the line y = x.
results$bivar %>%
qdapTools::list_df2df(col1 = "window_size") %>%
tidyr::pivot_longer(
cols = -c("window_size", "ld_block", "eqtl_dataset", "gene_locus", "gene_name", "phen1", "phen2"),
names_to = "feature",
values_to = "value"
) %>%
dplyr::mutate(
window_size =
window_size %>%
readr::parse_number() %>%
format(scientific = FALSE) %>%
as.factor()
) %>%
dplyr::filter(
feature %in% c("bonferroni", "fdr"),
value < 0.05
) %>%
dplyr::count(window_size, feature, name = "n_total") %>%
dplyr::mutate(
unique =
case_when(
feature == "bonferroni" ~ n_total - (overlap %>% dplyr::filter(bonferroni_100000 < 0.05 & bonferroni_50000 < 0.05) %>% nrow()),
feature == "fdr" ~ n_total - (overlap %>% dplyr::filter(fdr_100000 < 0.05 & fdr_50000 < 0.05) %>% nrow())
),
shared =
n_total-unique
) %>%
tidyr::pivot_longer(
cols = unique:shared,
names_to = "sharing",
values_to = "n"
) %>%
ggplot(
aes(
x = window_size,
y = n,
fill = sharing
)
) +
geom_col() +
facet_grid(cols = vars(feature)) +
labs(
x = "Window size (kb)",
y = "Number of significant bivariate local rg"
) +
theme_rhrFigure 4.2: Number of significant bivariate local genetic correlations across window sizes, using either the stringent bonferroni cut-off or the more lenient FDR cut-off. Bars are coloured by whether local genetic correlations are significant across both window sizes (shared) or only one (unique).
a <-
results_long %>%
inner_join(
overlap %>%
dplyr::filter(bonferroni_100000 < 0.05 & bonferroni_50000 < 0.05) %>%
dplyr::select(ld_block, eqtl_dataset, gene_locus, gene_name, phen1, phen2)
) %>%
dplyr::filter(feature == "r2") %>%
dplyr::rename(
`50` = window_50000,
`100` = window_100000
) %>%
ggpubr::ggpaired(
cond1 = "50",
cond2 = "100",
fill = "condition",
line.color = "gray",
line.size = 0.4,
ggtheme = theme_rhr
) +
labs(
x = "Window size (kb)",
y = "r2"
) +
theme(legend.position = "none")
b <-
results_long %>%
inner_join(
overlap %>%
dplyr::filter(bonferroni_100000 < 0.05 & bonferroni_50000 < 0.05) %>%
dplyr::select(ld_block, eqtl_dataset, gene_locus, gene_name, phen1, phen2)
) %>%
dplyr::filter(
feature %in% c("r2"),
!is.na(window_50000),
!is.na(window_100000)
) %>%
dplyr::mutate(
diff = window_100000 - window_50000
) %>%
ggplot(aes(x = diff)) +
geom_density() +
labs(
x = "Difference in r2 between two window sizes\n(diff = 100 kb - 50 kb)",
y = "Density"
) +
theme_rhr
ggpubr::ggarrange(
a,b,
labels = letters[1:2]
)Figure 4.3: (a) Boxplot displaying r2 estimates between phenotype pairs that demonstrated significant bivariate local genetic correlation (Bonferroni-corrected p < 0.05) across both window sizes. Grey lines indicate paired estimates. (b) Density plot displaying difference in r2 estimates between phenotype pairs that demonstrated significant bivariate local genetic correlation (Bonferroni-corrected p < 0.05) across both window sizes.
data_to_plot <-
bivar_bonf$window_100000 %>%
dplyr::filter(!str_detect(phen2, "ENSG")) %>%
dplyr::distinct(ld_block, phen1, phen2, rho) %>%
dplyr::group_by(ld_block, phen1, phen2) %>%
dplyr::summarise(rho = mean(rho)) %>%
dplyr::bind_rows(
bivar_bonf$window_100000 %>%
dplyr::filter(str_detect(phen2, "ENSG")) %>%
dplyr::distinct(eqtl_dataset, ld_block, phen1, phen2, rho) %>%
dplyr::mutate(
phen2 =
case_when(
str_detect(phen2, "ENSG") ~ eqtl_dataset,
TRUE ~ phen2
)
) %>%
dplyr::select(ld_block, phen1, phen2, rho)
) %>%
dplyr::ungroup()
plot_bivar_chord_diagram(
bivar_corr = data_to_plot,
fct_phen = c(fct_disease, "EQTLGEN", "PSYCHENCODE"),
palette = c(RColorBrewer::brewer.pal(n = 6, name = "BrBG"), "black", "grey")
)Figure 4.4: Chord diagram showing the number of distinct significant bivariate local genetic correlations between each of the traits across all LD blocks (Bonferroni-corrected p < 0.05). Pairs of disease phenotypes (which were tested across multiple genes within an LD block) were collapsed and their average rho within the LD block used. Positive and negative correlations are coloured red and blue, respectively.
data_to_plot <-
bivar_fdr$window_100000 %>%
dplyr::filter(!str_detect(phen2, "ENSG")) %>%
dplyr::distinct(ld_block, phen1, phen2, rho) %>%
dplyr::group_by(ld_block, phen1, phen2) %>%
dplyr::summarise(rho = mean(rho)) %>%
dplyr::bind_rows(
bivar_fdr$window_100000 %>%
dplyr::filter(str_detect(phen2, "ENSG")) %>%
dplyr::distinct(eqtl_dataset, ld_block, phen1, phen2, rho) %>%
dplyr::mutate(
phen2 =
case_when(
str_detect(phen2, "ENSG") ~ eqtl_dataset,
TRUE ~ phen2
)
) %>%
dplyr::select(ld_block, phen1, phen2, rho)
) %>%
dplyr::ungroup()
plot_bivar_chord_diagram(
bivar_corr = data_to_plot,
fct_phen = c(fct_disease, "EQTLGEN", "PSYCHENCODE"),
palette = c(RColorBrewer::brewer.pal(n = 6, name = "BrBG"), "black", "grey")
)Figure 4.5: Chord diagram showing the number of distinct significant bivariate local genetic correlations between each of the traits across all LD blocks (FDR-corrected p < 0.05). Pairs of disease phenotypes (which were tested across multiple genes within an LD block) were collapsed and their average rho within the LD block used. Positive and negative correlations are coloured red and blue, respectively.
results$univ$window_100000 %>%
dplyr::mutate(
phen =
case_when(
str_detect(phen, "ENSG") ~ eqtl_dataset,
TRUE ~ phen
)
) %>%
ggplot(
aes(
x = n_snps,
y = -log10(p)
)
) +
geom_point(alpha = 0.5) +
scale_x_log10() +
scale_colour_manual(
values = c(RColorBrewer::brewer.pal(n = 6, name = "BrBG"), "black", "grey")
) +
facet_wrap(vars(phen)) +
labs(
x = "Number of SNPs within the locus (logarithmic scale)",
y = "-log10(p-value)"
) +
theme_rhrFigure 4.6: Scatter plot of number of SNPs in a tested gene locus and the p-value from the univariate test for each trait.
Significance was determined using the more lenient FDR multiple test correction. Results that do not replicate using the more stringent Bonferroni multiple test correction should be interpreted with caution.
ld_blocks <-
bivar_fdr$window_100000$ld_block %>% unique() %>% sort()
qtl_genes <-
bivar_fdr$window_100000 %>%
dplyr::filter(str_detect(phen2, "ENSG")) %>%
.[["gene_locus"]] %>%
unique()
bivar_list <-
setNames(
bivar_fdr$window_100000 %>%
dplyr::group_split(ld_block),
nm =
str_c("locus_", ld_blocks)
)
ref <- import("/data/references/ensembl/gtf_gff3/v87/Homo_sapiens.GRCh37.87.gtf")
ref <- ref %>% keepSeqlevels(c(1:22), pruning.mode = "coarse")
ref <- ref[ref$type == "gene"]
loci_gr <-
LAVA::read.loci(here::here("results", "01_input_prep", "gwas_filtered.loci")) %>%
dplyr::rename(locus = LOC) %>%
dplyr::filter(locus %in% bivar_fdr$window_100000$ld_block) %>%
GenomicRanges::makeGRangesFromDataFrame(
.,
keep.extra.columns = TRUE,
ignore.strand = TRUE,
seqinfo = NULL,
seqnames.field = "CHR",
start.field = "START",
end.field = "STOP"
)
fig_list <- vector(mode = "list", length = length(ld_blocks))
for(block in ld_blocks){
locus_plot <-
plot_locus(
locus_gr = loci_gr[loci_gr$locus == block],
ref = ref,
highlight_gene = qtl_genes,
highlight_gene_label = "QTL - local rg?"
)
edge_plot <-
plot_qtl_edge_diagram(
bivar_corr_qtl = bivar_list[[str_c("locus_", block)]],
phen = fct_disease,
seed = 89
)
if(length(edge_plot)/4 >=2){
height <- c(1,3.5)
} else{
height <- c(1,1)
}
fig_list[[which(ld_blocks == block)]] <-
ggpubr::ggarrange(
locus_plot,
ggpubr::ggarrange(
plotlist = edge_plot,
ncol = 4,
nrow = ceiling(length(edge_plot)/4),
common.legend = TRUE,
legend = "top"
),
ncol = 1,
labels = letters[1:2],
heights = height
)
names(fig_list)[which(ld_blocks == block)] <- str_c("locus_", block)
}
fig_list[1:4]## $locus_681
Figure 4.7: (a) Locus plot. Gene loci with a significant bivariate local genetic correlation between a disease phenotype and eQTL are highlighted in blue (FDR-corrected p < 0.05). (b) Edge diagrams for gene loci where a significant bivariate local genetic correlation was observed between a disease phenotype and eQTL (FDR-corrected p < 0.05). Edges display the standardised coefficient for genetic correlation (rho) for significant bivariate local genetic correlations, with negative and positive correlations indicated by blue and red colour, respectively. GWAS and eQTL nodes are indicated by grey and white fill, respectively.
##
## $locus_1273
Figure 4.8: (a) Locus plot. Gene loci with a significant bivariate local genetic correlation between a disease phenotype and eQTL are highlighted in blue (FDR-corrected p < 0.05). (b) Edge diagrams for gene loci where a significant bivariate local genetic correlation was observed between a disease phenotype and eQTL (FDR-corrected p < 0.05). Edges display the standardised coefficient for genetic correlation (rho) for significant bivariate local genetic correlations, with negative and positive correlations indicated by blue and red colour, respectively. GWAS and eQTL nodes are indicated by grey and white fill, respectively.
##
## $locus_1719
Figure 4.9: (a) Locus plot. Gene loci with a significant bivariate local genetic correlation between a disease phenotype and eQTL are highlighted in blue (FDR-corrected p < 0.05). (b) Edge diagrams for gene loci where a significant bivariate local genetic correlation was observed between a disease phenotype and eQTL (FDR-corrected p < 0.05). Edges display the standardised coefficient for genetic correlation (rho) for significant bivariate local genetic correlations, with negative and positive correlations indicated by blue and red colour, respectively. GWAS and eQTL nodes are indicated by grey and white fill, respectively.
##
## $locus_2281
Figure 4.10: (a) Locus plot. Gene loci with a significant bivariate local genetic correlation between a disease phenotype and eQTL are highlighted in blue (FDR-corrected p < 0.05). (b) Edge diagrams for gene loci where a significant bivariate local genetic correlation was observed between a disease phenotype and eQTL (FDR-corrected p < 0.05). Edges display the standardised coefficient for genetic correlation (rho) for significant bivariate local genetic correlations, with negative and positive correlations indicated by blue and red colour, respectively. GWAS and eQTL nodes are indicated by grey and white fill, respectively.
fig_list[5]## $locus_2351
Figure 4.11: (a) Locus plot. Gene loci with a significant bivariate local genetic correlation between a disease phenotype and eQTL are highlighted in blue (FDR-corrected p < 0.05). (b) Edge diagrams for gene loci where a significant bivariate local genetic correlation was observed between a disease phenotype and eQTL (FDR-corrected p < 0.05). Edges display the standardised coefficient for genetic correlation (rho) for significant bivariate local genetic correlations, with negative and positive correlations indicated by blue and red colour, respectively. GWAS and eQTL nodes are indicated by grey and white fill, respectively.
ld_blocks <-
bivar_bonf$window_100000 %>%
dplyr::group_by(eqtl_dataset, gene_locus) %>%
.[["ld_block"]] %>%
unique() %>%
sort()
qtl_genes <-
bivar_bonf$window_100000 %>%
dplyr::group_by(eqtl_dataset, gene_locus) %>%
dplyr::filter(str_detect(phen2, "ENSG")) %>%
.[["gene_locus"]] %>%
unique()
bivar_list <-
setNames(
bivar_bonf$window_100000 %>%
dplyr::group_split(ld_block),
nm =
str_c("locus_", ld_blocks)
)
fig_list <- vector(mode = "list", length = length(ld_blocks))
for(block in ld_blocks){
locus_plot <-
plot_locus(
locus_gr = loci_gr[loci_gr$locus == block],
ref = ref,
highlight_gene = qtl_genes,
highlight_gene_label = "QTL - local rg?"
)
edge_plot <-
plot_qtl_edge_diagram(
bivar_corr_qtl = bivar_list[[str_c("locus_", block)]],
phen = fct_disease,
seed = 89
)
if(length(edge_plot)/4 >=2){
height <- c(1,3.5)
} else{
height <- c(1,0.5)
}
fig_list[[which(ld_blocks == block)]] <-
ggpubr::ggarrange(
locus_plot,
ggpubr::ggarrange(
plotlist = edge_plot,
ncol = 4,
nrow = ceiling(length(edge_plot)/4),
common.legend = TRUE,
legend = "top"
),
ncol = 1,
labels = letters[1:2],
heights = height
)
names(fig_list)[which(ld_blocks == block)] <- str_c("locus_", block)
}
fig_list[1:4]## $locus_681
Figure 4.12: (a) Locus plot. Gene loci with a significant bivariate local genetic correlation between a disease phenotype and eQTL are highlighted in blue (Bonferroni-corrected p < 0.05). (b) Edge diagrams for gene loci where a significant bivariate local genetic correlation was observed between a disease phenotype and eQTL (Bonferroni-corrected p < 0.05). Edges display the standardised coefficient for genetic correlation (rho) for significant bivariate local genetic correlations, with negative and positive correlations indicated by blue and red colour, respectively. GWAS and eQTL nodes are indicated by grey and white fill, respectively.
##
## $locus_1273
Figure 4.13: (a) Locus plot. Gene loci with a significant bivariate local genetic correlation between a disease phenotype and eQTL are highlighted in blue (Bonferroni-corrected p < 0.05). (b) Edge diagrams for gene loci where a significant bivariate local genetic correlation was observed between a disease phenotype and eQTL (Bonferroni-corrected p < 0.05). Edges display the standardised coefficient for genetic correlation (rho) for significant bivariate local genetic correlations, with negative and positive correlations indicated by blue and red colour, respectively. GWAS and eQTL nodes are indicated by grey and white fill, respectively.
##
## $locus_1719
Figure 4.14: (a) Locus plot. Gene loci with a significant bivariate local genetic correlation between a disease phenotype and eQTL are highlighted in blue (Bonferroni-corrected p < 0.05). (b) Edge diagrams for gene loci where a significant bivariate local genetic correlation was observed between a disease phenotype and eQTL (Bonferroni-corrected p < 0.05). Edges display the standardised coefficient for genetic correlation (rho) for significant bivariate local genetic correlations, with negative and positive correlations indicated by blue and red colour, respectively. GWAS and eQTL nodes are indicated by grey and white fill, respectively.
##
## $locus_2281
Figure 4.15: (a) Locus plot. Gene loci with a significant bivariate local genetic correlation between a disease phenotype and eQTL are highlighted in blue (Bonferroni-corrected p < 0.05). (b) Edge diagrams for gene loci where a significant bivariate local genetic correlation was observed between a disease phenotype and eQTL (Bonferroni-corrected p < 0.05). Edges display the standardised coefficient for genetic correlation (rho) for significant bivariate local genetic correlations, with negative and positive correlations indicated by blue and red colour, respectively. GWAS and eQTL nodes are indicated by grey and white fill, respectively.
fig_list[5]## $locus_2351
Figure 4.16: (a) Locus plot. Gene loci with a significant bivariate local genetic correlation between a disease phenotype and eQTL are highlighted in blue (Bonferroni-corrected p < 0.05). (b) Edge diagrams for gene loci where a significant bivariate local genetic correlation was observed between a disease phenotype and eQTL (Bonferroni-corrected p < 0.05). Edges display the standardised coefficient for genetic correlation (rho) for significant bivariate local genetic correlations, with negative and positive correlations indicated by blue and red colour, respectively. GWAS and eQTL nodes are indicated by grey and white fill, respectively.
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
## setting value
## version R version 4.0.5 (2021-03-31)
## os Ubuntu 16.04.7 LTS
## system x86_64, linux-gnu
## ui X11
## language (EN)
## collate en_GB.UTF-8
## ctype en_GB.UTF-8
## tz Europe/London
## date 2022-02-01
##
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
## package * version date lib source
## abind 1.4-5 2016-07-21 [2] CRAN (R 4.0.5)
## assertthat 0.2.1 2019-03-21 [2] CRAN (R 4.0.5)
## backports 1.3.0 2021-10-27 [1] CRAN (R 4.0.5)
## Biobase 2.50.0 2020-10-27 [2] Bioconductor
## BiocGenerics * 0.36.1 2021-04-16 [2] Bioconductor
## BiocParallel 1.24.1 2020-11-06 [2] Bioconductor
## Biostrings 2.58.0 2020-10-27 [2] Bioconductor
## bit 4.0.4 2020-08-04 [2] CRAN (R 4.0.5)
## bit64 4.0.5 2020-08-30 [2] CRAN (R 4.0.5)
## bitops 1.0-7 2021-04-24 [2] CRAN (R 4.0.5)
## bookdown 0.22 2021-04-22 [2] CRAN (R 4.0.5)
## broom 0.7.10 2021-10-31 [1] CRAN (R 4.0.5)
## bslib 0.3.1 2021-10-06 [1] CRAN (R 4.0.5)
## car 3.0-11 2021-06-27 [2] CRAN (R 4.0.5)
## carData 3.0-4 2020-05-22 [2] CRAN (R 4.0.5)
## cellranger 1.1.0 2016-07-27 [2] CRAN (R 4.0.5)
## chron 2.3-56 2020-08-18 [1] CRAN (R 4.0.5)
## circlize * 0.4.13 2021-06-09 [1] CRAN (R 4.0.5)
## cli 3.1.0 2021-10-27 [1] CRAN (R 4.0.5)
## colorspace 2.0-2 2021-06-24 [2] CRAN (R 4.0.5)
## cowplot 1.1.1 2020-12-30 [2] CRAN (R 4.0.5)
## crayon 1.4.2 2021-10-29 [1] CRAN (R 4.0.5)
## crosstalk 1.1.1 2021-01-12 [2] CRAN (R 4.0.5)
## curl 4.3.2 2021-06-23 [2] CRAN (R 4.0.5)
## data.table 1.14.2 2021-09-27 [1] CRAN (R 4.0.5)
## DBI 1.1.1 2021-01-15 [2] CRAN (R 4.0.5)
## dbplyr 2.1.1 2021-04-06 [2] CRAN (R 4.0.5)
## DelayedArray 0.16.3 2021-03-24 [2] Bioconductor
## digest 0.6.29 2021-12-01 [1] CRAN (R 4.0.5)
## dplyr * 1.0.7 2021-06-18 [2] CRAN (R 4.0.5)
## DT 0.19 2021-09-02 [1] CRAN (R 4.0.5)
## ellipsis 0.3.2 2021-04-29 [2] CRAN (R 4.0.5)
## evaluate 0.14 2019-05-28 [2] CRAN (R 4.0.5)
## fansi 0.5.0 2021-05-25 [2] CRAN (R 4.0.5)
## farver 2.1.0 2021-02-28 [2] CRAN (R 4.0.5)
## fastmap 1.1.0 2021-01-25 [2] CRAN (R 4.0.5)
## forcats * 0.5.1 2021-01-27 [2] CRAN (R 4.0.5)
## foreign 0.8-81 2020-12-22 [2] CRAN (R 4.0.5)
## fs 1.5.1 2021-11-30 [1] CRAN (R 4.0.5)
## generics 0.1.1 2021-10-25 [1] CRAN (R 4.0.5)
## GenomeInfoDb * 1.26.7 2021-04-08 [2] Bioconductor
## GenomeInfoDbData 1.2.4 2021-07-03 [2] Bioconductor
## GenomicAlignments 1.26.0 2020-10-27 [2] Bioconductor
## GenomicRanges * 1.42.0 2020-10-27 [2] Bioconductor
## ggforce 0.3.3 2021-03-05 [2] CRAN (R 4.0.5)
## ggplot2 * 3.3.5 2021-06-25 [2] CRAN (R 4.0.5)
## ggpubr 0.4.0 2020-06-27 [2] CRAN (R 4.0.5)
## ggraph * 2.0.5 2021-02-23 [1] CRAN (R 4.0.5)
## ggrepel 0.9.1 2021-01-15 [2] CRAN (R 4.0.5)
## ggsignif 0.6.3 2021-09-09 [1] CRAN (R 4.0.5)
## GlobalOptions 0.1.2 2020-06-10 [1] CRAN (R 4.0.5)
## glue 1.5.1 2021-11-30 [1] CRAN (R 4.0.5)
## graphlayouts 0.7.1 2020-10-26 [1] CRAN (R 4.0.5)
## gridExtra 2.3 2017-09-09 [2] CRAN (R 4.0.5)
## gtable 0.3.0 2019-03-25 [2] CRAN (R 4.0.5)
## haven 2.4.3 2021-08-04 [1] CRAN (R 4.0.5)
## here 1.0.1 2020-12-13 [1] CRAN (R 4.0.5)
## highr 0.9 2021-04-16 [2] CRAN (R 4.0.5)
## hms 1.1.1 2021-09-26 [1] CRAN (R 4.0.5)
## htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.0.5)
## htmlwidgets 1.5.4 2021-09-08 [1] CRAN (R 4.0.5)
## httr 1.4.2 2020-07-20 [2] CRAN (R 4.0.5)
## igraph 1.2.7 2021-10-15 [1] CRAN (R 4.0.5)
## IRanges * 2.24.1 2020-12-12 [2] Bioconductor
## jquerylib 0.1.4 2021-04-26 [2] CRAN (R 4.0.5)
## jsonlite 1.7.2 2020-12-09 [2] CRAN (R 4.0.5)
## knitr 1.36 2021-09-29 [1] CRAN (R 4.0.5)
## labeling 0.4.2 2020-10-20 [2] CRAN (R 4.0.5)
## lattice 0.20-44 2021-05-02 [2] CRAN (R 4.0.5)
## LAVA 0.0.6 2021-09-01 [1] xgit (@7be3421)
## lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.0.5)
## lubridate 1.8.0 2021-10-07 [1] CRAN (R 4.0.5)
## magrittr 2.0.1 2020-11-17 [2] CRAN (R 4.0.5)
## MASS 7.3-54 2021-05-03 [2] CRAN (R 4.0.5)
## Matrix 1.3-4 2021-06-01 [2] CRAN (R 4.0.5)
## MatrixGenerics 1.2.1 2021-01-30 [2] Bioconductor
## matrixStats 0.61.0 2021-09-17 [1] CRAN (R 4.0.5)
## mgcv 1.8-36 2021-06-01 [2] CRAN (R 4.0.5)
## modelr 0.1.8 2020-05-19 [2] CRAN (R 4.0.5)
## munsell 0.5.0 2018-06-12 [2] CRAN (R 4.0.5)
## nlme 3.1-152 2021-02-04 [2] CRAN (R 4.0.5)
## openxlsx 4.2.4 2021-06-16 [2] CRAN (R 4.0.5)
## pillar 1.6.4 2021-10-18 [1] CRAN (R 4.0.5)
## pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.0.5)
## polyclip 1.10-0 2019-03-14 [2] CRAN (R 4.0.5)
## purrr * 0.3.4 2020-04-17 [2] CRAN (R 4.0.5)
## qdapTools 1.3.5 2020-04-17 [1] CRAN (R 4.0.5)
## R6 2.5.1 2021-08-19 [1] CRAN (R 4.0.5)
## RColorBrewer 1.1-2 2014-12-07 [2] CRAN (R 4.0.5)
## Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.0.5)
## RCurl 1.98-1.5 2021-09-17 [1] CRAN (R 4.0.5)
## readr * 2.0.2 2021-09-27 [1] CRAN (R 4.0.5)
## readxl 1.3.1 2019-03-13 [2] CRAN (R 4.0.5)
## reprex 2.0.0 2021-04-02 [2] CRAN (R 4.0.5)
## rio 0.5.27 2021-06-21 [2] CRAN (R 4.0.5)
## rlang 0.4.12 2021-10-18 [1] CRAN (R 4.0.5)
## rmarkdown 2.11 2021-09-14 [1] CRAN (R 4.0.5)
## rprojroot 2.0.2 2020-11-15 [2] CRAN (R 4.0.5)
## Rsamtools 2.6.0 2020-10-27 [2] Bioconductor
## rstatix 0.7.0 2021-02-13 [2] CRAN (R 4.0.5)
## rstudioapi 0.13 2020-11-12 [2] CRAN (R 4.0.5)
## rtracklayer * 1.50.0 2020-10-27 [2] Bioconductor
## rvest 1.0.1 2021-07-26 [2] CRAN (R 4.0.5)
## S4Vectors * 0.28.1 2020-12-09 [2] Bioconductor
## sass 0.4.0 2021-05-12 [2] CRAN (R 4.0.5)
## scales 1.1.1 2020-05-11 [2] CRAN (R 4.0.5)
## sessioninfo * 1.1.1 2018-11-05 [2] CRAN (R 4.0.5)
## shape 1.4.6 2021-05-19 [1] CRAN (R 4.0.5)
## stringi 1.7.6 2021-11-29 [1] CRAN (R 4.0.5)
## stringr * 1.4.0 2019-02-10 [2] CRAN (R 4.0.5)
## SummarizedExperiment 1.20.0 2020-10-27 [2] Bioconductor
## tibble * 3.1.6 2021-11-07 [1] CRAN (R 4.0.5)
## tidygraph 1.2.0 2020-05-12 [1] CRAN (R 4.0.5)
## tidyr * 1.1.4 2021-09-27 [1] CRAN (R 4.0.5)
## tidyselect 1.1.1 2021-04-30 [2] CRAN (R 4.0.5)
## tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.0.5)
## tweenr 1.0.2 2021-03-23 [2] CRAN (R 4.0.5)
## tzdb 0.2.0 2021-10-27 [1] CRAN (R 4.0.5)
## utf8 1.2.2 2021-07-24 [2] CRAN (R 4.0.5)
## vctrs 0.3.8 2021-04-29 [2] CRAN (R 4.0.5)
## viridis 0.6.2 2021-10-13 [1] CRAN (R 4.0.5)
## viridisLite 0.4.0 2021-04-13 [2] CRAN (R 4.0.5)
## vroom 1.5.5 2021-09-14 [1] CRAN (R 4.0.5)
## withr 2.4.3 2021-11-30 [1] CRAN (R 4.0.5)
## xfun 0.27 2021-10-18 [1] CRAN (R 4.0.5)
## XML 3.99-0.8 2021-09-17 [1] CRAN (R 4.0.5)
## xml2 1.3.2 2020-04-23 [2] CRAN (R 4.0.5)
## XVector 0.30.0 2020-10-27 [2] Bioconductor
## yaml 2.2.1 2020-02-01 [2] CRAN (R 4.0.5)
## zip 2.2.0 2021-05-31 [2] CRAN (R 4.0.5)
## zlibbioc 1.36.0 2020-10-27 [2] Bioconductor
##
## [1] /home/rreynolds/R/x86_64-pc-linux-gnu-library/4.0
## [2] /opt/R/4.0.5/lib/R/library